Robust M-estimators and Machine Learning Algorithms for Improving the Predictive Accuracy of Seaweed Contaminated Big Data
Authors
Abstract
A common problem in regression analysis using ordinary least squares (OLS) is the effect of outliers or contaminated data on the parameter estimates. A robust method that is not sensitive to outliers and can handle contaminated data is needed. In this study, the objective is to determine the significant parameters for the moisture content of seaweed after drying and to develop a hybrid model to reduce outliers. The data were collected with sensors from the v-Groove Hybrid Solar Drier (v-GHSD) at Semporna, on the South-Eastern Coast of Sabah, Malaysia. After second-order interaction, we have 435 parameters, and each parameter has 1914 observations. First, we used four machine learning algorithms (random forest, support vector machine, bagging, and boosting) to determine the significant parameters by selecting 15, 25, 35, and 45 parameters. Second, we developed the hybrid model using robust methods: M Bi-Square, M Hampel, and M Huber. The results show that there is an improvement in the reduction of the number of outliers and better prediction for contaminated seaweed big data. For the highest variable importance of seaweed, the hybrid model with M Bi-square performs best because it has the lowest percentage of outliers, 4.08 %.
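As a rough illustration of the three M-estimators named in the abstract, the sketch below fits a single-predictor line by iteratively reweighted least squares using Huber, bisquare, and Hampel weight functions. It is a minimal sketch, not the authors' hybrid model: the tuning constants (1.345, 4.685, and 2/4/8) are conventional textbook defaults, the MAD-based scale estimate is an assumption, and the toy data and all function names are hypothetical.

```python
import statistics

# Weight functions w(u) for a standardized residual u.
# Tuning constants are conventional defaults, assumed here for illustration.
def huber_weight(u, c=1.345):
    r = abs(u)
    return 1.0 if r <= c else c / r

def bisquare_weight(u, c=4.685):
    r = abs(u)
    return (1.0 - (r / c) ** 2) ** 2 if r < c else 0.0

def hampel_weight(u, a=2.0, b=4.0, c=8.0):
    r = abs(u)
    if r <= a:
        return 1.0
    if r <= b:
        return a / r
    if r < c:
        return a * (c - r) / ((c - b) * r)
    return 0.0

def weighted_line_fit(x, y, w):
    """Closed-form weighted least squares for y = b0 + b1 * x."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

def m_estimate_line(x, y, weight_fn, n_iter=50, tol=1e-10):
    """Iteratively reweighted least squares, started from the OLS fit."""
    w = [1.0] * len(x)
    b0, b1 = weighted_line_fit(x, y, w)
    for _ in range(n_iter):
        resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
        med = statistics.median(resid)
        # Robust scale: normalized median absolute deviation (an assumption).
        scale = statistics.median(abs(r - med) for r in resid) / 0.6745
        if scale == 0:
            break
        w = [weight_fn(r / scale) for r in resid]
        new_b0, new_b1 = weighted_line_fit(x, y, w)
        converged = abs(new_b0 - b0) + abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0, b1, w

# Hypothetical toy data: y = 2 + 0.5 x with small deterministic noise,
# contaminated by two gross outliers at x = 17 and x = 19.
x = [float(i) for i in range(20)]
y = [2.0 + 0.5 * xi + 0.1 * ((2 * i) % 5 - 2) for i, xi in enumerate(x)]
y[17] += 30.0
y[19] += 30.0

ols = weighted_line_fit(x, y, [1.0] * len(x))
fits = {
    "huber": m_estimate_line(x, y, huber_weight),
    "bisquare": m_estimate_line(x, y, bisquare_weight),
    "hampel": m_estimate_line(x, y, hampel_weight),
}

print("OLS slope:", round(ols[1], 3))
for name, (b0, b1, w) in fits.items():
    print(name, "slope:", round(b1, 3), "outlier weights:", round(w[17], 3), round(w[19], 3))
```

The Huber weights only shrink the influence of large residuals, whereas the bisquare and Hampel weights fall to zero beyond their cutoffs, so gross outliers can be rejected outright. Final weights near zero also offer one simple way to flag observations as outliers.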
Similar resources
Improving the Performance of Machine Learning Algorithms for Heart Disease Diagnosis by Optimizing Data and Features
The heart is one of the most important organs of the body, and heart disease is the major cause of death in the world and in Iran. This is why early, timely diagnosis is one of the key foundations for preventing and reducing deaths from this disease. So far, many studies have been done on heart disease with the aim of prediction, diagnosis, and treatment. However, most of them have been mostly f...
Parallelizing Big Data Machine Learning Algorithms with Model Rotation
This paper investigates a novel approach to parallelization of machine learning algorithms using model rotation as an effective parallel computation model. We identify the importance of model rotation owing to its ability to shift the latest model updates to a neighboring computation, thereby guaranteeing model consistency which is hard to achieve in other computation models. We distinguish com...
An Investigation into the Impact of M-Game-Enhanced Blended Module of Teaching and Learning on Iranian Students' English Literacy Skills and Subskills Learning
By linking old and new media of teaching and learning (storytelling and mobile games) in a blended module, the present study sought to compare game-based teaching and learning of the skills and subskills of English literacy (vocabulary, reading, and writing) with conventional methods. To this end, using a triangulated design together with the instructional system model (Tomi, 2010), pre-made and localized games deliverable via mobile communications (...
Exploiting Unlabeled Data for Improving Accuracy of Predictive Data Mining
Predictive data mining typically relies on labeled data without exploiting a much larger amount of available unlabeled data. The goal of this paper is to show that using unlabeled data can be beneficial in a range of important prediction problems and therefore should be an integral part of the learning process. Given an unlabeled dataset representative of the underlying distribution and a K-cla...
Spatiotemporal Estimation of PM2.5 Concentration Using Remotely Sensed Data, Machine Learning, and Optimization Algorithms
PM2.5 (particles <2.5 μm in aerodynamic diameter) can be measured by ground stations in urban areas, but the number of these stations and their geographical coverage are limited. Therefore, these data are not adequate for calculating concentrations of PM2.5 over a large urban area. This study aims to use Aerosol Optical Depth (AOD) satellite images and meteorological data from 2014 to 2017 ...
Journal
Journal title: Journal of Nigerian Society of Physical Sciences
Year: 2023
ISSN: 2714-4704
DOI: https://doi.org/10.46481/jnsps.2023.1137